iSS-PseDNC: Identifying Splicing Sites Using Pseudo Dinucleotide Composition

نویسندگان

  • Wei Chen
  • Peng-Mian Feng
  • Hao Lin
  • Kuo-Chen Chou
چکیده

In eukaryotic genes, exons are generally interrupted by introns. Accurately removing introns and joining exons together are essential processes in eukaryotic gene expression. With the avalanche of genome sequences generated in the postgenomic age, it is highly desired to develop automated methods for rapid and effective detection of splice sites that play important roles in gene structure annotation and even in RNA splicing. Although a series of computational methods were proposed for splice site identification, most of them neglected the intrinsic local structural properties. In the present study, a predictor called "iSS-PseDNC" was developed for identifying splice sites. In the new predictor, the sequences were formulated by a novel feature-vector called "pseudo dinucleotide composition" (PseDNC) into which six DNA local structural properties were incorporated. It was observed by the rigorous cross-validation tests on two benchmark datasets that the overall success rates achieved by iSS-PseDNC in identifying splice donor site and splice acceptor site were 85.45% and 87.73%, respectively. It is anticipated that iSS-PseDNC may become a useful tool for identifying splice sites and that the six DNA local structural properties described in this paper may provide novel insights for in-depth investigations into the mechanism of RNA splicing.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

iRNAm5C-PseDNC: identifying RNA 5-methylcytosine sites by incorporating physical-chemical properties into pseudo dinucleotide composition

Occurring at cytosine (C) of RNA, 5-methylcytosine (m5C) is an important post-transcriptional modification (PTCM). The modification plays significant roles in biological processes by regulating RNA metabolism in both eukaryotes and prokaryotes. It may also, however, cause cancers and other major diseases. Given an uncharacterized RNA sequence that contains many C residues, can we identify which...

متن کامل

iRSpot-PseDNC: identify recombination spots with pseudo dinucleotide composition

Meiotic recombination is an important biological process. As a main driving force of evolution, recombination provides natural new combinations of genetic variations. Rather than randomly occurring across a genome, meiotic recombination takes place in some genomic regions (the so-called 'hotspots') with higher frequencies, and in the other regions (the so-called 'coldspots') with lower frequenc...

متن کامل

rDNAse: R package for generating various numerical representation schemes of DNA sequences

The rDNAse R package can generate various feature vectors for DNA sequences, this R package could: 1) Calculate three nucleic acid composition features describing the local sequence information by means of kmers (subsequences of DNA sequences); 2) Calculate six autocorrelation features describing the level of correlation between two oligonucleotides along a DNA sequence in terms of their specif...

متن کامل

Recombination Hotspot/Coldspot Identification Combining Three Different Pseudocomponents via an Ensemble Learning Approach

Recombination presents a nonuniform distribution across the genome. Genomic regions that present relatively higher frequencies of recombination are called hotspots while those with relatively lower frequencies of recombination are recombination coldspots. Therefore, the identification of hotspots/coldspots could provide useful information for the study of the mechanism of recombination. In this...

متن کامل

hnRNP A1 controls HIV-1 mRNA splicing through cooperative binding to intron and exon splicing silencers in the context of a conserved secondary structure.

The removal of the second intron in the HIV-1 rev/tat pre-mRNAs, which involves the joining of splice site SD4 to SA7, is inhibited by hnRNP A1 by a mechanism that requires the intronic splicing silencer (ISS) and the exon splicing silencer (ESS3). In this study, we have determined the RNA secondary structure and the hnRNP A1 binding sites within the 3' splice site region by phylogenetic compar...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره 2014  شماره 

صفحات  -

تاریخ انتشار 2014